fix(databricks_zerobus sink): defer Unity Catalog schema fetch out of build()#25408
Conversation
30bffb2 to
83ebd94
Compare
petere-datadog
left a comment
There was a problem hiding this comment.
Tested, works as expected, code looks good to me
3e6c015 to
bdd9e09
Compare
|
please add |
It is added here: https://github.com/flaviofcruz/vector/blob/bdd9e0910af6bd3fd2c2459222f402d823b99858/.github/workflows/semantic.yml But I can merge a separate PR if that makes it easier. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f2cea4a861
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
f2cea4a to
9b65cc6
Compare
When the Tower retry budget is exhausted on a retryable failure (UC 5xx/408/429,
network blip, transient SDK error), the driver previously mapped the resulting
Err to EventStatus::Rejected — a permanent drop, indistinguishable from a UC
404 or an EncodingError. Ack-enabled sources and disk buffers had no way to
know a replay would have helped.
This change wires a small Tower layer outside the existing retry stack:
- Ok responses pass through unchanged.
- Err that downcasts to a retryable ZerobusSinkError (or doesn't downcast at
all — conservative) becomes Ok(ZerobusResponse::errored()), so the driver
sets Errored and the source / disk buffer may replay.
- Err that is permanent (EncodingError, SchemaError { retryable: false },
MissingAckOffset, non-retryable SDK errors) still propagates so the driver
maps it to Rejected.
ZerobusResponse now carries an EventStatus field with delivered() / errored()
constructors; ingest() returns ZerobusResponse::delivered(...).
Also brings unity_catalog_schema::status_is_retryable into line with Vector's
canonical RetryStrategy::Default by delegating to it. The visible effect: UC
501 (Not Implemented) is now classified as permanent (was retryable under the
prior blanket 5xx rule).
Co-authored-by: Isaac
9b65cc6 to
cd7fe9f
Compare
Summary
databricks_zerobus'sSinkConfig::build()synchronously calls Unity Catalog to fetch the table's protobuf descriptor. If the table doesn't exist or credentials are wrong, the call fails inside build() and Vector exits before the sink even starts — even whenhealthcheck.enabled= false.This aligns the sink with the convention used by AWS S3, Kafka, etc.:
build()does only local setup, the healthcheck does the remote probe, and runtime failures surface per-batch via the existing retry/event-status path.Changes:
Vector configuration
Before: Vector exits at startup with Unity Catalog API returned error 404: ....
After: Vector starts; batches are rejected at ingest time with the same error logged per batch.
How did you test this PR?
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changeloglabel to this PR.References
Notes
@vectordotdev/vectorto reach out to us regarding this PR.pre-pushhook, please see this template.make fmtmake check-clippy(if there are failures it's possible some of them can be fixed withmake clippy-fix)make testgit merge origin masterandgit push.Cargo.lock), pleaserun
make build-licensesto regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.